In this article, we describe a test of the active time model for concurrent variable interval (VI) choice. 
The active time model (ATM) suggests that the time since the most recent response is one of the 
variables controlling choice in concurrent VI VI schedules of reinforcement. In our experiment, 
pigeons were trained in a multiple concurrent procedure similar to that employed by Belke (1992), with VI 20-s 
and VI 40-s schedules in one component, and VI 40-s and VI 80-s schedules in the other component. 
However, rather than use a free-operant design, we used a discrete-trial procedure that restricted 
interresponse times to a range of 0.5-9.0 s. After 45 sessions of training, unreinforced probe periods 
were mixed with reinforced training periods. These probes paired the two stimuli associated with the VI 
40-s schedules. Further, the probes were defined such that during their occurrence, interresponse times 
were either “short” (0.5-3.0 s) or “long” (7.5-9.0 s). All pigeons showed a preference for the stimulus 
associated with the relatively rich VI 40-s schedule—a result mirroring that of Belke. We also observed, 
though, that this preference was more extreme during long probes than during short probes—a result 
predicted by ATM. 
This experiment tested the hypothesis that choice under concurrent variable-interval (VI)
VI schedules of reinforcement is controlled by the time since the most recent response. This
time variable has been termed “active time,” and the model describing its relevance to
choice behavior has been designated the active time model, or ATM (Cleaveland, 1999). ATM
is a stochastic, molecular model that successfully describes a range of choice behavior for
pigeons responding on concurrent VI VI schedules of reinforcement (Brown & Cleaveland,
2009; Cleaveland, 1999, 2008; McKenzie & Cleaveland, 2010). The model assumes that
during training pigeons learn a function that relates active times to switches and stays into
and out of choice “states.” With its emphasis on interresponse times and switches versus
stays, ATM falls within a broad theoretical approach to choice behavior that is shared by
models such as momentary maximization (Shimp, 1969) and the stay/switch model
(MacDonall, 2009). Our test of ATM uses a discrete-trial multiple concurrent VI VI
procedure. However, before describing our test in detail, we will first describe how ATM
emerges from the moment-to-moment contingencies arranged by concurrent VI VI schedules.

Theoretical Framework 

Concurrent VI VI reinforcement schedules are usually arranged in the laboratory so that
reinforcers occur at an average fixed rate and with a constant overall probability. This is
accomplished by using an exponential distribution, as given in Equation 1 and frequently
approximated with a procedure outlined by Fleshler and Hoffman (1962).

$$p_i = 1 - e^{-t_i/\lambda_i} \qquad (1)$$

Equation 1 shows that the momentary probability of reinforcement at Choice i, $p_i$,
changes as a function of the time, $t_i$, since last choosing Choice i and the average
reinforcement interval, $\lambda_i$, assigned to Choice i. The contingencies of reinforcement
for a VI schedule, then, explicitly target $t_i$, the interresponse time (IRT) that separates
choices at a particular schedule. In concurrent VI VI schedules every choice can be defined in
relation to two such IRTs: active time and background time (Cleaveland, 1999). Active time
corresponds to the time since the most recent schedule choice, while background time refers
to the time since the alternative schedule was chosen.
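
The relationship between Equation 1 and the two classes of IRT can be illustrated with a minimal sketch. It assumes a hypothetical subject that responds once every 5 s under concurrent VI 20-s VI 40-s schedules; the function and variable names (`reinforcement_probability`, `trace`, `time_since`) are illustrative only and are not taken from the original study.

```python
import math

def reinforcement_probability(t, mean_interval):
    """Equation 1: p = 1 - exp(-t / lambda)."""
    return 1.0 - math.exp(-t / mean_interval)

# Hypothetical response sequence: two responses to the VI 20-s schedule,
# one to the VI 40-s schedule, then a switch back to the VI 20-s schedule.
schedules = {"VI20": 20.0, "VI40": 40.0}
sequence = ["VI20", "VI20", "VI40", "VI20"]
step = 5.0  # assumed constant time between responses, in seconds

def trace(sequence, schedules, step):
    """Return (choice, kind, t_i, p_i) for each response in the sequence."""
    time_since = {name: 0.0 for name in schedules}  # t_i for each schedule
    rows = []
    previous = None
    for choice in sequence:
        # Time accrues for every schedule between responses.
        for name in time_since:
            time_since[name] += step
        p = reinforcement_probability(time_since[choice], schedules[choice])
        if previous is None:
            kind = "first"
        else:
            kind = "stay" if choice == previous else "switch"
        rows.append((choice, kind, time_since[choice], round(p, 3)))
        # Choosing an alternative resets its own timer only.
        time_since[choice] = 0.0
        previous = choice
    return rows

for row in trace(sequence, schedules, step):
    print(row)
```

In this sketch, the time entering Equation 1 equals the active time on a stay and the background time on a switch, which is why the two IRTs can diverge while the subject remains at one alternative.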
Fig. 1. Active and background interresponse times. The figure assumes that a subject responds once every 5 s, and 
shows the relationship between interresponse times (IRTs) and reinforcement probabilities programmed according to 
Equation 1. In this case, the subject is shown to make two responses to the VI 20-s schedule, then one response to the VI 40-s 
schedule, before switching back to the VI 20-s schedule. Notice that choosing an alternative immediately resets its 
probability of reinforcement to 0. Further, the figure highlights two types of IRT: active and background. As the subject 
responds, the time since the last peck to the alternative schedule, that is, the background IRT, increases. Conversely, the 
active IRT is defined as the time since the most recent choice, regardless of the alternative selected. 


Figure 1 illustrates our two classes of IRT for concurrent VI 20-s VI 40-s schedules, and,
using Equation 1, relates them to reinforcement probabilities. For simplicity, Figure 1
assumes a response every 5 s. As can be seen, reinforcement probabilities reset at an
alternative with every response to that alternative. After a reset, reinforcement probabilities
increase with negative acceleration as a function of active time for the most immediately
selected alternative, and as a function of background time for the non-selected alternative.

Given Figure 1, one might reasonably conclude that organisms will discriminate among
active and background times so as to choose whichever alternative has the momentarily
higher probability of reinforcement. This hypothesis is termed momentary maximizing,
and it does have some support (e.g., Hinson & Staddon, 1983a, 1983b; Silberberg, Hamilton,
Ziriax, & Casey, 1978). However, a consistent finding in the literature is that animals do not
become more likely to switch from an alternative as background time increases (Cleaveland,
1999; Heyman, 1979; Nevin, 1969). This failure to discriminate among changes in
background times, however, should not come as a surprise. A concurrent VI VI procedure
presents a complex interval timing structure. Background and active IRTs do not start in
parallel, and while the background IRT is accumulating, the active IRT might accumulate
and reset several times. Also, the trigger initiating each IRT is not an external event,
but a behavior that is common to both intervals (i.e., a key peck). Given this complexity,
ATM proposes that temporal control during concurrent VI VI schedules is restricted
to active IRTs. In this sense ATM is a constrained version of momentary maximizing.

If choice is controlled by a single, momentary temporal variable, however, then by
necessity the relevant operants for ATM become switches and stays, rather than the
selection of programmed reinforcement schedules per se. This is a viewpoint shared
by MacDonall’s stay/switch model (MacDonall, 2000, 2003, 2009). In experiments
with rats MacDonall has shown that “stays” and “switches” are allocated at molar levels so
as to maximize the reinforcement probabilities for each of these classes of behavior. ATM adds
a dynamic variable to this framework. That is, ATM claims that it is not just stays and switches
that are reinforced but rather stays and switches at particular times.

Figure 2, taken from Cleaveland (1999), shows how stay and switch responses are, in
fact, mediated by active IRTs during concurrent VI VI schedules. The switch functions are
from 3 birds that were trained under concurrent VI 60-s VI 180-s schedules. The left-hand
plots show the probability of switching out of the VI 60-s schedule given changes in the
active IRT. All of the subjects showed higher switch probabilities at shorter than longer
active times. The right-hand plots show a different result. They provide switch probabilities
from the relatively lean, VI 180-s schedule, and show that birds tended to increase switch
probabilities as a function of active time.

Fig. 2. Active time functions for 3 subjects. The data shown here are taken from Cleaveland (1999), and show the 
behavior of 3 pigeons in a discrete-trial, concurrent VI 60-s VI 180-s schedule of reinforcement. Both plots provide the 
probability of switching from a schedule given increasing active IRTs. Plots on the left are for cases in which the most 
recent response was to the VI 60-s schedule, while the plots on the right are for cases in which the most recent response 
was to the VI 180-s schedule. Note that when the most recent response was to the relatively rich, VI 60-s schedule, birds 
were more likely to select the VI 60-s schedule again as active IRTs increased. In contrast, when the most recent response 
was to the relatively lean, VI 180-s schedule, birds were more likely to switch away from the VI 180-s schedule as active 
IRTs increased. 

Testing Active IRT Control 

ATM, then, suggests that during concurrent VI VI procedures, birds come to pattern their
stay and switch responses depending upon the active IRT. To test this hypothesis, the
following experiment utilizes a procedure similar to that of Belke (1992). In Belke’s
experiment, pigeons were trained under multiple concurrent VI VI schedules. Specifically, the
birds experienced periods of a concurrent VI 20-s VI 40-s schedule intermixed with periods
of a concurrent VI 40-s VI 80-s schedule. Within pairings, Belke observed that his
subjects’ choice behavior conformed to the well known matching law (Herrnstein, 1961).
That is, during training, his subjects approximated a 2:1 preference for the relatively richer
schedule of a given pairing. After training, Belke intermixed unreinforced probe periods
and reinforced normal periods. Probes paired stimuli associated with the two VI 40-s
schedules, while normal periods continued the trained VI VI pairings. We will use a subscript
to indicate the schedule with which a probe stimulus was trained. Belke observed that all of
the birds showed a strong preference for the VI 40₈₀ stimulus when it was paired with the
VI 40₂₀ stimulus. This result, since replicated (Gibbon, 1995; Williams & Bell, 1996),
invalidates several choice models (see Williams, 1994).

Cleaveland (2008) pointed out that ATM, in principle, predicts a choice bias in the same
direction as that observed by Belke (1992) in his probes. If the switching functions shown in
Figure 2 are associated with discriminative stimuli, then Belke’s probes essentially pair a
“rich” switch function with one that is “lean.” Such a pairing would cause a subject to switch
more often from the VI 40₂₀ stimulus than from the VI 40₈₀ stimulus. McKenzie and
Cleaveland (2010) tested this hypothesis in a procedure similar to Belke’s, and found that
the observed switch functions could be used to accurately model the individual preferences
obtained during probes. The following experiment extends the work of McKenzie and
Cleaveland by noting that the functions in Figure 2 make a novel prediction. Namely, if
ATM is correct, and functions such as those shown in Figure 2 determine choice, then it
should be possible to control the preferences shown by pigeons during probes by controlling
active time. This prediction is illustrated in Figure 3.
[Figure 3 panel title: Hypothetical Trained Switch Functions]

Fig. 3. Our predictions are based on the assumption that pigeons learn typical rich versus
lean, active IRT switch functions. As noted in the text, after responding to a relatively rich
VI schedule pigeons have been found to be more likely to switch after shorter than longer
active IRTs. Conversely, after responding to a relatively lean VI schedule, pigeons usually
show a high probability of switching across all active IRTs. In other words, at short active
IRTs the difference between switching probabilities (d_short) for the relatively lean and rich
schedules would be small in comparison to the difference in switch probabilities after long
IRTs (d_long). If switch probabilities determine overall preference as suggested by the active
time model (ATM), then we would predict that pigeons would show more extreme schedule
preferences after long active IRTs than after short active IRTs. Our experiment tests this
hypothesis via probe trials in which a restricted portion of IRTs are allowed, and pigeons are
given a choice between stimuli previously paired with a relatively rich VI 40-s schedule
(VI 40₈₀) and a relatively lean VI 40-s schedule (VI 40₂₀).

In the following experiment pigeons were trained under multiple concurrent VI 20-s VI
40-s pairings and VI 40-s VI 80-s pairings. However, rather than utilize a free-operant
procedure, we used a discrete-trial procedure. In a discrete-trial procedure, IRTs are under
experimental control. In our case, IRTs were allowed to range from 0.5-9.5 s according to a
Gaussian function with a mean of 5 s. After training, we exposed our birds to two types of
probes: short and long. Short probes were restricted to IRTs of 0.5-3.0 s, while long
probes were restricted to IRTs of 7.0-9.5 s. We predicted that if active IRTs mediate
switch-and-stay responses, then preference during short probes would be less extreme than
preference during long probes. This follows from the hypothetical functions shown in
Figure 3. In Figure 3 the difference between rich and lean switch probabilities is less
extreme at shorter than at longer active times. To our knowledge, our prediction that active
time will influence preference during probes is unique to ATM. No other extant model of
behavior under concurrent VI VI schedules of reinforcement makes such a prediction.

METHOD 

Subjects 

Five adult White Carneaux pigeons were used as subjects. The birds were housed in
individual cages (40 cm × 40 cm × 40 cm) located in a well-ventilated, brightly lit room
with a 12:12 h light/dark cycle. The birds were kept at 85% of their free-feeding weights for
the duration of the experiment, and had access to water and grit while in their cages.
Food consisted of mixed grains in the experimental chamber and Purina pigeon chow
mixed with mixed grains when in the home room. On days in which subjects underwent an
experimental session (5-6 times per week), their only access to food was the reinforcement
received during the session. This experiment conformed to practices outlined in the Guide
for the Care and Use of Animals.

Apparatus 

Experimental sessions took place in a 36-cm × 34-cm × 34-cm stainless steel operant box
(32 × 34 × 34 with grid flooring inserted) located in a dimly lit room. A Plexiglas clasp
door allowed access into the interior of the box. Opposite this door was a panel with three
translucent pecking keys, a small 24-V light bulb, and a hopper that allowed access to grain
during reinforcement. The hopper was located at the vertical median of the panel, 8 cm
from the grid floor, and was 2 cm in diameter. There was also a white 24-V light bulb 16 cm
from the top of the box and directly above the hopper, which was illuminated during the
delivery of reinforcement. The three pecking keys were equally spaced from each other,
7.25 cm apart, 7 cm from the top, and 6 cm from the nearest side-edge to the left and
right. The diameter of each key was 2.6 cm. Stimuli were projected on to the back of the
pecking keys by means of colored 24-V light bulbs mounted behind the keys.

Procedure 

Training sessions. All birds experienced 45 sessions of multiple concurrent VI VI
schedules with each bird being run 5-6 days per week. Schedule pairs consisted of VI 20-s
VI 40-s and VI 40-s VI 80-s schedules with reinforcement at each schedule programmed
according to Equation 1. Schedules were arranged across the three keys such that the right
and left keys were only associated with the VI 40-s schedules. The center key was associated
with either the VI 20-s or VI 80-s schedule, depending on the schedule pairing. In addition,
each individual schedule was associated with a white, red, green, or blue color that was
projected onto the back of the keys. One of the VI 40-s schedules was paired with a white
stimulus on the leftmost key; the other VI 40-s schedule was paired with a blue stimulus on
the rightmost key. The VI 20-s and VI 80-s schedules were paired with either a red or a green
stimulus, with 2 birds experiencing one pairing and 3 experiencing the other. Reinforcement
consisted of 1.8 s of access to food from the hopper. During the reinforcement period, the
light above the hopper was illuminated.

Each training session consisted of twenty 120-s choice periods separated by 15-s timeouts
in which all of the key lights were extinguished. During a choice period, one of the
two pairs of concurrent VI VI schedules was randomly selected and the appropriate key
lights illuminated. Subjects were permitted 2 s to peck either of the illuminated keys. After a
response, after a response followed by reinforcement, or after 2 s without a response, the
key lights were extinguished for an interresponse interval. This interval was determined
by a Gaussian distribution with a mean of 5 s and a variance of 2. Intervals were further
constrained to a range of 0.5-9.5 s by resampling the Gaussian distribution if an interval
fell outside of this range. If the subject responded during the interresponse interval,
then the timer for the interval was reset. At the end of an interval, that choice period’s
schedule keys were reilluminated, and the subject was permitted another choice.
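
The resampling rule for interresponse intervals described above can be sketched as follows. This is a minimal illustration rather than the software used in the experiment: the function name `sample_interval` and its parameter names are assumptions, and the variance of 2 is taken to be in squared seconds.

```python
import random

def sample_interval(rng, mean=5.0, variance=2.0, low=0.5, high=9.5):
    """Draw an interval from a Gaussian and resample until it lies in [low, high]."""
    sd = variance ** 0.5  # the text gives a variance, so convert to a standard deviation
    while True:
        interval = rng.gauss(mean, sd)
        if low <= interval <= high:
            return interval

rng = random.Random(0)  # seeded only so that the sketch is reproducible
training_intervals = [sample_interval(rng) for _ in range(1000)]
short_probe_intervals = [sample_interval(rng, low=0.5, high=3.0) for _ in range(1000)]
long_probe_intervals = [sample_interval(rng, low=7.0, high=9.5) for _ in range(1000)]
```

The same function covers the Short and Long probe ranges by narrowing `low` and `high`; because those ranges lie in the tails of the distribution, most draws are rejected and resampled.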

As noted, reinforcers were programmed according to Equation 1. In other words, and
in contrast with many other procedures, reinforcers were never “held” for a subject
either within a 120-s choice period, or between periods. Rather, the amount of time that had
passed since an alternative was last chosen ($t_i$ in Equation 1) directly determined the
probability of reinforcement for that particular choice. These times were always set to 0 for
both alternatives at the start of a 120-s choice period. During the reinforcement of an
alternative, the IRT for that alternative did not accrue, and remained at 0 for the duration
of the reinforcer delivery. However, the IRT for the nonreinforced alternative continued to
accrue for the duration of a reinforcer delivery.

Probe sessions. After 45 training sessions, probe sessions were introduced. A total of
four probe sessions were administered. Between probe sessions, each subject underwent
three training sessions as outlined above.

Each probe session consisted of thirty 120-s 
choice periods. Twenty of these periods were 
the same as the previously trained multiple 
concurrent VI VI schedules, while 10 were 
probe periods during which no reinforcement 
was delivered. Probe periods consisted of 
pairings of the illumination of the schedule 
keys used for the two VI 40-s schedules (i.e., 
the leftmost and rightmost keys). 

Two types of probe periods were defined. These types were characterized by the length
of the interresponse intervals. Short probes restricted interresponse intervals to 0.5-3.0 s;
long probes restricted interresponse intervals to 7.0-9.5 s. These intervals were drawn from
the same Gaussian function that defined the training intervals, but restricted to the desired
range. In an attempt to control for unequal extinction effects, long probe periods were
programmed to occur with approximately twice the frequency of short probe periods.
Of the 40 total probe periods experienced by each subject, 13 were short probes, while 27
were long probes. Within a probe session, probe types were selected randomly from a
predetermined list. However, probe periods and training periods were intermixed such
that probes were always separated by at least one, but no more than three, training
period(s).

Table 1

Comparison of Choice Proportions Using Obtained Data and Simulations of ATM.

Training
Schedule pair    Subject    Obtained Data    ATM Simulations
VI 20 VI 40      6          0.82             0.83
                 7          0.73             0.66
                 8          0.70             0.71
                 9          0.61             0.64
                 10         0.74             0.82
                 Average    0.72             0.73
VI 40 VI 80      6          0.65             0.68
                 7          0.74             0.73
                 8          0.65             0.57
                 9          0.62             0.63
                 10         0.56             0.53
                 Average    0.64             0.63

Probes
Probe type       Subject    Obtained Data    ATM Simulations
Short            6          0.65             0.42
                 7          0.53             0.47
                 8          0.48             0.51
                 9          0.50             0.44
                 10         0.58             0.59
                 Average    0.55             0.49
Long             6          0.84             0.72
                 7          0.73             0.77
                 8          0.65             0.62
                 9          0.62             0.58
                 10         0.70             0.64
                 Average    0.71             0.67

Note. In all cases, the data provided indicate the degree of preference for the relatively
rich schedule or stimulus. Simulations utilized the programmed Gaussian IRT function and the
switch functions provided in Figures 4 and 7.

RESULTS 

Overall choice proportions were calculated for each schedule pairing across the last 15
sessions of training (see Table 1). For the relatively rich schedule these proportions
ranged from .64-.83 in the VI 20-s VI 40-s pairing (mean = .73, sd = .08), while in the VI
40-s VI 80-s pairing choice proportions ranged from .56-.74 (mean = .64, sd = .07). As a
group the pigeons conformed reasonably well to a 2:1 preference for the relatively rich
schedule, although individual birds did show undermatching and overmatching (Baum,
1974, 1979). Further, despite differences in the group averages, a two-tailed t-test did not
reveal a significant difference between overall choice proportions for the two schedule
pairings, t(4) = 1.6, p = .15.
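
As a point of reference for the 2:1 benchmark, the choice proportion implied by strict matching can be worked out directly. The calculation below is a sketch that assumes the programmed reinforcement rates (1/20, 1/40, and 1/80 reinforcers per second) stand in for obtained rates, which need not hold in a discrete-trial procedure; the symbols $B$ (responses) and $R$ (reinforcement rate) are notation introduced here for illustration.

```latex
\frac{B_{\text{rich}}}{B_{\text{rich}} + B_{\text{lean}}}
  = \frac{R_{\text{rich}}}{R_{\text{rich}} + R_{\text{lean}}}
\qquad\Longrightarrow\qquad
\frac{1/20}{1/20 + 1/40} = \frac{1/40}{1/40 + 1/80} = \frac{2}{3} \approx .67
```

Under this assumption both schedule pairings imply the same choice proportion of about .67 for the relatively rich schedule.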

Figure 4 provides the proportion of switches for each subject within binned active times
drawn from the .5-9.5 s range defined by our Gaussian function. Bin sizes had a range of 1 s
except for the longest and shortest bins, which collected active times over a range of 2.5 s
(i.e., from .5-3.0 s and from 7.0-9.5 s). Proportions were calculated over the last 15 sessions
of training and were not considered valid unless the total responses collected for a bin were
greater than 20. Plots on the left provide switch proportions out of the relatively rich
alternative, while plots on the right provide the switch proportions out of the lean alternative.
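
The binning rule just described can be sketched in a few lines. The bin edges and the 20-response validity criterion come from the text; the function name, the input format (pairs of active time and a switch flag), and the assignment of boundary values to the upper bin are assumptions made for illustration.

```python
from bisect import bisect_right

# Bin edges in seconds: 0.5-3.0, four 1-s bins, and 7.0-9.5.
BIN_EDGES = [0.5, 3.0, 4.0, 5.0, 6.0, 7.0, 9.5]

def switch_proportions(responses, edges=BIN_EDGES, min_count=20):
    """Return the proportion of switches per active-time bin.

    `responses` is an iterable of (active_time, switched) pairs. A bin is
    reported as None unless it holds more than `min_count` responses.
    """
    n_bins = len(edges) - 1
    counts = [0] * n_bins
    switches = [0] * n_bins
    for active_time, switched in responses:
        if not edges[0] <= active_time <= edges[-1]:
            continue  # outside the programmed range
        # Interior boundary values fall in the upper bin (an assumption).
        idx = min(bisect_right(edges, active_time) - 1, n_bins - 1)
        counts[idx] += 1
        switches[idx] += int(switched)
    return [
        switches[i] / counts[i] if counts[i] > min_count else None
        for i in range(n_bins)
    ]

# Made-up responses, used only to exercise the function.
example = [(1.0, True)] * 15 + [(1.0, False)] * 15 + [(8.0, True)] * 5
proportions = switch_proportions(example)
```

In use, the function would be applied separately to responses that followed a choice of the rich alternative and to those that followed a choice of the lean alternative, giving one switch function for each.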

Figure 4 shows that switch proportions varied as a function of the active schedule
IRT. All birds were more likely to switch from the relatively rich alternative after a short IRT
than after a long one. After active times of .5-3.0 s, for example, Bird 6 was three times more
likely to emit a switch than a stay when in the VI 40₈₀-s schedule. After times of 7.0-9.5 s,
though, this same bird was three times more likely to emit a stay than a switch when in the
same schedule. In contrast, switching out of the relatively lean schedules either increased
in frequency or remained relatively flat as a function of active IRT duration. Bird 6 was
about equally likely to switch or stay from the VI 80₄₀-s schedule during the shortest bin
intervals. At the longest bin interval, this same pigeon was approximately twice as likely to
switch.

Figure 4 illustrates that the differences between relatively rich and lean functions
were more pronounced at longer than at shorter active times. Figure 5 examines this
pattern in more detail by considering the difference between the switching functions for
the two VI 40-s schedules (i.e., the relatively rich VI 40₈₀ and the relatively lean VI 40₂₀).
For simplicity only differences at the shortest and longest time bin are considered. Figure 5
clearly shows that the difference between these two active time functions was greater at the
longest time bin than at the shortest time bin.
[Figure 4. Left panels: Rich Schedule; right panels: Lean Schedule; x-axis: Time Bins. Legend: VI 20-s to VI 40-s, VI 40-s to VI 20-s, VI 40-s to VI 80-s, VI 80-s to VI 40-s.]

Fig. 4. Functions showing the probability of a switch given different active IRT bins during the last 15 sessions of 
multiple concurrent VI VI training. Time bins ranged from IRTs less than 3 s to IRTs greater than 7 s. From top to bottom 
plots are for Birds 6-10, respectively. Plots in the left column (filled symbols) indicate cases in which the most recent 
response was to the relatively rich schedule of a concurrent pair. Plots in the right column (empty symbols) indicate cases 
in which the most recent response was to the relatively lean schedule of a concurrent pair. Circles specify the VI 20-s VI 
40-s pair, while squares specify the VI 40-s VI 80-s pair. 


Within a range of .5-3.0 s, the probability differences ranged from -.20 to .22 (mean =
.09). For the two longest bins, probability differences ranged from .16 to .56 (mean =
.43). The stimuli associated with these reinforcement schedules were, of course, not
paired during training. However, comparing these active time functions allows us to make
predictions for our probes. For example, when the difference is negative (i.e., proportion of
switches out of VI 40₈₀ > the proportion out of VI 40₂₀), then ATM predicts a preference for
the relatively lean VI 40₂₀ stimulus. Conversely, a positive result would lead to the prediction
of a preference for the relatively rich VI 40₈₀ stimulus. Taken together, Figure 5 leads to the
prediction that Long probes (IRTs ranging 7.0-9.5 s) will produce more extreme preference
for the VI 40₈₀ stimulus than Short probes (IRTs ranging .5-3.0 s). Such a
prediction is borne out in Figure 6.

Fig. 5. Differences between the switch functions provided in Figure 4. Specifically, the figure shows the difference 
between the relatively rich VI 40₈₀ and the relatively lean VI 40₂₀ functions at the shortest and longest active IRT time 
bins. Bars left to right indicate Birds 6-10, respectively. The function differences permit a qualitative prediction of ATM 
for probe tests that will pair the stimuli associated with the VI 40-s schedules (see Figure 3). ATM predicts that small 
differences would correspond to indifference during probe tests. Negative values correspond to preference for the VI 
40₂₀ stimulus, while positive values correspond to preference for the VI 40₈₀ stimulus. Thus, Figure 5 leads to the 
prediction that our Long probes will produce a much greater preference for the VI 40₈₀ stimulus than will be shown 
during our Short probes. 

Figure 6 presents the proportion of responses made to the VI 40₈₀ stimulus during Short
and Long probes. These data show that during Long probes subjects preferred the VI 40₈₀
stimulus (mean = .71, sd = .08) to a much greater degree than during Short probes
(mean = .55, sd = .07). This was true of every subject, and a one-way paired t-test of the data
found the result to be significant, t(4) = -3.27, p < .001.

To more tightly link the switch functions shown in Figure 4 and the choice proportions
shown by our subjects, Monte Carlo simulations of training and of the probe tests were
conducted for each subject. These simulations consisted of two parts. First, an IRT generator
determined when a response would occur. This generator was simply defined by the same
Gaussian function used in our experiment for determining discrete-trial intervals. For
training simulations, the range of IRTs was .5-9.5 s. For the probe simulations the range was
set at .5-3.0 s for Short probes and at 7.0-9.5 s for Long probes. After determining when a
response would occur, the appropriate switch function given in Figure 4 was used to
determine whether this number resulted in a switch or stay response. Simulations ran for a
total of 5,000 responses each, and the results are shown in Table 1.
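
The two-part simulation described above (an IRT generator followed by a switch-or-stay decision) can be sketched as follows. This is not the authors' simulation code. The switch probabilities in `SWITCH` are hypothetical values chosen only to resemble the qualitative shape of the functions in Figure 3, and the names `sample_irt` and `simulate` are assumptions.

```python
import random
from bisect import bisect_right

BIN_EDGES = [0.5, 3.0, 4.0, 5.0, 6.0, 7.0, 9.5]

# Hypothetical switch probabilities per active-time bin (NOT the obtained data):
# switching out of the rich stimulus declines with active time, while
# switching out of the lean stimulus stays high.
SWITCH = {
    "rich": [0.60, 0.50, 0.45, 0.40, 0.30, 0.25],
    "lean": [0.65, 0.65, 0.65, 0.65, 0.65, 0.65],
}

def sample_irt(rng, low, high, mean=5.0, variance=2.0):
    """IRT generator: Gaussian draw, resampled until it falls in [low, high]."""
    sd = variance ** 0.5
    while True:
        irt = rng.gauss(mean, sd)
        if low <= irt <= high:
            return irt

def simulate(rng, low, high, n_responses=5000):
    """Return the simulated proportion of responses to the rich stimulus."""
    current = rng.choice(["rich", "lean"])
    rich_count = 0
    for _ in range(n_responses):
        irt = sample_irt(rng, low, high)
        idx = min(bisect_right(BIN_EDGES, irt) - 1, len(BIN_EDGES) - 2)
        # Switch or stay according to the function for the current stimulus.
        if rng.random() < SWITCH[current][idx]:
            current = "lean" if current == "rich" else "rich"
        rich_count += current == "rich"
    return rich_count / n_responses

rng = random.Random(1)  # seeded only for reproducibility
short_pref = simulate(rng, 0.5, 3.0)
long_pref = simulate(rng, 7.0, 9.5)
```

With constant switch probabilities inside a probe range, the long-run proportion of responses to the rich stimulus approaches the lean switch probability divided by the sum of the two switch probabilities, so a larger rich-versus-lean difference at long active times yields a more extreme simulated preference.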

In terms of the schedule preferences obtained during the last 15 sessions of training,
simulated preferences were statistically indistinguishable from the actual, obtained data.
Further, the simulated data approximated the rank order observed in the actual data. During
VI 20-s VI 40-s training, Subject 6 produced the most extreme preference, while Subject 9
produced the least extreme preference. Similarly, during VI 40-s VI 80-s training, Subjects 7
and 10 were the “high” and “low” subjects, respectively, in terms of both obtained. Our
simulations generated similar rankings. More importantly, though, our simulations mirrored
the results obtained from our probe tests. As can be seen in Table 1, simulations of the
VI 40₂₀ VI 40₈₀ probes produced less extreme preferences during Short probes than during
Long probes (.49 and .65, respectively, in favor of the VI 40₈₀). That is, using switch functions
obtained from the end of training allowed for predictive simulations of novel stimulus
combinations for individual subjects.

Fig. 6. Obtained probe preferences. The bar plots provide the proportion of responses made to the VI 40₈₀ stimulus 
during Short and Long probe intervals. Bars left to right indicate Birds 6-10, respectively. Note that the x-axis crosses at 
0.5, which would correspond to an equal number of responses being made to the VI 40₈₀ and the VI 40₂₀ stimuli (i.e., 
indifference). Therefore, proportions less than 0.5 indicate preference for the VI 40₂₀ stimulus, while proportions greater 
than .5 indicate a preference for the VI 40₈₀ stimulus. As predicted by ATM, Figure 6 shows that preference for the VI 
40₈₀ stimulus became more extreme during Long probe intervals as compared with Short probe intervals. 

Finally, recall that our procedure biased probe presentations by 2:1 in favor of the Long
probes. This was done in order to account for the greater overall responding permitted
during Short probes. Since all probe trials were 120 s in duration, it follows that more
responding would occur in any given Short probe, when IRTs were restricted to .5-3.0 s,
than in any given Long probe, when IRTs were restricted to 7.0-9.5 s. Given our 2:1 bias in
probe presentations we obtained the following response totals. For Subjects 9 and 10, the total
number of responses during short and long probe sessions was approximately equal (252
vs. 240 and 285 vs. 287, respectively). Subjects 6 and 7 made more responses during short
probe sessions than during long probe sessions (392 vs. 273; 264 vs. 196, respectively),
while Subject 8 responded more during long probe sessions (109 vs. 142).

DISCUSSION 

Our experiment tested the hypothesis that choice under concurrent VI VI schedules is
partially determined by the time since the most recent response. This variable has been
termed “active time” and lies at the core of a model termed the active time model, or ATM
(Cleaveland, 1999, 2008). To test this hypothesis, pigeons were trained with a discrete-trial
procedure under multiple concurrent VI 20-s VI 40-s and VI 40-s VI 80-s schedules. At the
end of this training, all of our birds showed regularities relating active time to choice
behavior (Figures 4 and 5), and these regularities allowed us to make predictions regarding
preference during novel stimulus pairings. These pairings consisted of unreinforced
probes that paired the two VI 40-s stimuli in a manner similar to Belke (1992). Our
prediction, drawn from the active time functions given in Figure 4, was that choice for the
VI 40₈₀ stimulus would be positively correlated with active IRTs. This prediction was borne
out. Birds chose the VI 40₈₀ stimulus over the VI 40₂₀ stimulus significantly more often
during long, as opposed to during short, probes (Figure 6). To our knowledge, ATM is
the only choice model that makes this prediction.

It must be noted, though, that the data in 
support of ATM are but a late addition to a 
growing set of findings suggesting that subjects 
learn patterns of stay-and-switch responses 
during concurrent schedules of reinforcement.
Indeed, some of the earliest support
for momentary maximization models of choice 
(e.g., Shimp, 1966, 1969) came from findings 
of response patterns during concurrent VI VI 
schedules. For example, assuming that a 
subject responds at a constant rate, concurrent 
VI VI schedules yield the greatest probability 
of reinforcement at each response if the 
emitted response sequence is equivalent to 
the ratio of the schedule values. Figure 1 
illustrates this fact. For a VI 20-s VI 40-s 
schedule, the optimal response sequence 
consists of two responses to the VI 20, followed 
by a single response to the VI 40. In fact, 
pigeons do approximate the optimal response 
sequence when responding under concurrent 
VI VI schedules of reinforcement (Silberberg 
et al., 1978). 
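
The momentary maximizing argument can be made concrete with a short simulation. The sketch below is illustrative only and is not the analysis of Shimp (1966, 1969). It assumes exponentially distributed VI intervals, a constant interresponse time, and a tie-breaking rule that favors the richer schedule; the function names and parameters are hypothetical.

```python
import math


def setup_probability(elapsed_s, vi_mean_s):
    """Probability that a reinforcer has been set up after elapsed_s seconds.

    Assumes exponentially distributed intervals with mean vi_mean_s
    (an assumption of this sketch, not a detail taken from the article).
    """
    return 1.0 - math.exp(-elapsed_s / vi_mean_s)


def momentary_maximizing_sequence(vi_means, irt_s=1.0, n_responses=9):
    """Choose, at each response, the schedule with the highest momentary
    probability of reinforcement, given a constant interresponse time."""
    # Order schedules from richest to leanest so that ties favor the richer one.
    keys = sorted(vi_means, key=vi_means.get)
    # Number of interresponse times elapsed since each schedule was last chosen.
    since_last = {key: 1 for key in keys}
    sequence = []
    for _ in range(n_responses):
        probs = {
            key: setup_probability(since_last[key] * irt_s, vi_means[key])
            for key in keys
        }
        best = keys[0]
        for key in keys[1:]:
            # Switch only for a strictly larger probability (small tolerance
            # guards against floating-point noise when the values are tied).
            if probs[key] > probs[best] + 1e-12:
                best = key
        sequence.append(best)
        for key in keys:
            since_last[key] = 1 if key == best else since_last[key] + 1
    return sequence


if __name__ == "__main__":
    print(momentary_maximizing_sequence({"VI 20": 20.0, "VI 40": 40.0}))
```

Under these assumptions the simulated sequence repeats two VI 20 responses followed by one VI 40 response. The second VI 20 response in each cycle occurs at an exact tie between the two momentary probabilities, so the 2:1 pattern depends on the tie-breaking rule assumed here.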

However, the learned patterns of responding
suggested by momentary maximization are
assumed to emerge from a comparison of the
relative value (in terms of momentary reinforcement
probabilities) of the concurrent
schedules of reinforcement. In contrast, more 
recent results indicate that learned patterns of 
responding are to some degree independent 
of the relative value of the underlying schedules
of reinforcement (Gibbon, 1995; McDevitt
& Bell, 2008; Williams & Bell, 1996).
For instance, Williams and Bell conducted an
experiment that used the same multiple concurrent
VI 20-s VI 40-s and VI 40-s VI 80-s schedules of
reinforcement as Belke (1992)
and the current study. However, their
procedure signaled the arrival of reinforcers 
at the VI 20-s schedule. Such signaling caused 
the birds to spend considerably more time 
responding at the VI 40₂₀ stimulus during
training. In probes that paired the VI 40₂₀ and
VI 40₈₀ stimuli, Williams and Bell found that
their subjects preferred the VI 40₂₀ stimulus.
In other words, it appeared that stay-and- 
switch patterns established during training 
determined the probe results. 


In terms of theory, the sequential dependencies
observed under concurrent schedules of
reinforcement have led some to focus on 
whether the appropriate response unit in such 
schedules is simply “choices of VI x or VI y.” For
example, Machado (1992) used frequency-dependent
schedules, of which concurrent VIs
are a subset, that differentially reinforced
relative choice frequencies. What he found was 
that when response units were defined in terms 
of a single stay or switch, pigeons quickly 
learned to maximize reinforcement by alternating
L (left) and R (right) pecks. Further, when
response units were defined in terms of two 
responses—LL, LR, RL and RR—some of the 
pigeons learned to maximize reinforcement by 
emitting each pair once, in sequence (e.g., 
RRLLRLLR). Similarly, MacDonall’s stay/switch 
model (MacDonall, 2000, 2003, 2009) is a molar 
model of choice in which stays and switches are 
the reinforced response units. In experiments 
with rats, MacDonall has shown that stays and
switches appear to be allocated at molar levels so 
as to maximize the reinforcement probabilities 
for each of these classes of behavior. 

The results reported in this article, then,
fit comfortably with a broad range of data.
Animals appear to learn patterns of responding
under concurrent schedules of reinforcement,
and these patterns are at least partially
independent of the absolute value of the 
underlying schedules. Our contribution is to 
show that such patterns of responding have a 
temporal component determined by the active 
IRT. Under concurrent VI VI schedules of 
reinforcement, our subjects showed switch 
patterns that correlated with active IRTs. 
Further, we found that these switch patterns 
carried over to unreinforced probe trials that 
utilized novel stimulus combinations. Future 
work, therefore, will need to focus on the 
origins of active IRT switch functions under 
concurrent schedules of reinforcement. 